
Adapt move_intermediate_cache for sglang prefix + mtp #422

Merged
RuixuanZhang06 merged 2 commits into sgl-project:main from silencejade:br_fix_stateupdate on Apr 3, 2026


Conversation

Contributor

@silencejade silencejade commented Apr 2, 2026

Dependency

sgl-project: Ascend/sglang#202

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request decouples source and destination indices in the Mamba state update Triton kernel and its wrapper function, allowing for more flexible cache movement. Previously, a single index tensor was used for both source and destination. Feedback suggests strengthening input validation by asserting that all index tensors (dst_indices_tensor, src_indices_tensor, and last_steps_tensor) have matching lengths and ensuring they are contiguous int32 tensors to prevent potential out-of-bounds access or type mismatches in the Triton kernel.
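
For readers who want to see what decoupled indices and the suggested checks amount to, here is a minimal plain-Python reference model. It is not the Triton kernel or wrapper from this PR. The function names `check_indices` and `move_states_reference`, the nested-list cache layout, and the copy rule (take the state at `last_steps[i]` from source slot `src_indices[i]` and write it to destination slot `dst_indices[i]`) are all assumptions made for illustration. The dtype and contiguity checks mentioned in the review apply to real tensors and are only noted in a comment here.

```python
from typing import List, Sequence

State = List[float]


def check_indices(
    dst_indices: Sequence[int],
    src_indices: Sequence[int],
    last_steps: Sequence[int],
    num_dst_slots: int,
    num_src_slots: int,
    num_steps: int,
) -> None:
    """Reject index sets that would read or write out of bounds.

    Hypothetical helper: on real tensors the review additionally asks for
    contiguous int32 index tensors, which plain lists cannot express.
    """
    if not (len(dst_indices) == len(src_indices) == len(last_steps)):
        raise ValueError(
            "dst_indices, src_indices and last_steps must have the same length"
        )
    for d, s, t in zip(dst_indices, src_indices, last_steps):
        if not 0 <= d < num_dst_slots:
            raise IndexError(f"destination index {d} out of range")
        if not 0 <= s < num_src_slots:
            raise IndexError(f"source index {s} out of range")
        if not 0 <= t < num_steps:
            raise IndexError(f"step index {t} out of range")


def move_states_reference(
    dst_cache: List[State],
    intermediate_cache: List[List[State]],
    dst_indices: Sequence[int],
    src_indices: Sequence[int],
    last_steps: Sequence[int],
) -> None:
    """Assumed semantics: dst_cache[dst[i]] = intermediate_cache[src[i]][last_steps[i]]."""
    num_steps = len(intermediate_cache[0]) if intermediate_cache else 0
    check_indices(
        dst_indices,
        src_indices,
        last_steps,
        num_dst_slots=len(dst_cache),
        num_src_slots=len(intermediate_cache),
        num_steps=num_steps,
    )
    for d, s, t in zip(dst_indices, src_indices, last_steps):
        # Copy the selected state so later edits to the source do not alias.
        dst_cache[d] = list(intermediate_cache[s][t])


if __name__ == "__main__":
    dst = [[0.0], [0.0], [0.0]]
    inter = [[[1.0], [2.0]], [[3.0], [4.0]]]
    # Source slot 0 goes to destination slot 2, source slot 1 to destination slot 0.
    move_states_reference(dst, inter, [2, 0], [0, 1], [1, 0])
    print(dst)
```

With a single shared index list, the source and destination slots would have to coincide. Passing them separately, as modeled above, lets a state move between different slots, which is why a length mismatch between the lists becomes a new failure mode worth checking before launch.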

@RuixuanZhang06 RuixuanZhang06 merged commit d16fb13 into sgl-project:main Apr 3, 2026
3 of 6 checks passed
@silencejade silencejade deleted the br_fix_stateupdate branch April 7, 2026 11:58

2 participants